Engineered yeast for cellulosic ethanol production

ABSTRACT

The disclosure provides designer cellulosomes for efficient hydrolysis of cellulosic material and, more particularly, for the generation of ethanol.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation of patent application Ser. No. 13/129,301, filed Aug. 29, 2011, which is a U.S. National Stage Application filed under 35 U.S.C. §371 and claims priority to International Application Serial No. PCT/US09/64491, filed Nov. 14, 2009, which claims priority under 35 U.S.C. §119 from Provisional Application Ser. No. 61/115,068, filed Nov. 15, 2008, the disclosures of which are incorporated herein by reference.

TECHNICAL FIELD

The disclosure provides designer cellulosomes. The disclosure also provides methods for efficient hydrolysis of cellulosic material and, more particularly, for the generation of ethanol.

BACKGROUND

Several billion gallons of renewable fuel must be produced by 2012 with most of that produced as biofuel using renewable biomass. In particular, bioethanol from renewable sources provides an attractive form of alternative energy. It has been estimated that the amount of ethanol needed as transportation fuel will reach 7.5 billion gallons. However, the total capacity of ethanol production in this country is only about 4.2 billion gallons, significantly lower than the required amount.

SUMMARY

The disclosure provides a synthetic yeast consortium for direct fermentation of cellulose to ethanol with productivity, yield and final concentration close to that from glucose fermentation. The engineering strategy described herein uses the efficiency of hydrolysis and synergy among multi-cellulases. To emulate the success of a natural cellulose hydrolysis mechanism, a complex cellulosome structure is assembled on a yeast cell surface using a constructed yeast consortium, which enables the ethanol-producing strains to utilize cellulose and concomitantly ferment it to ethanol. More importantly, by organizing these cellulases in an ordered structure, enhanced synergy increases the hydrolysis, and thereby the production of ethanol.

The disclosure provides a culture comprising: a first recombinant yeast strain comprising an anchoring scaffoldin (anScaff); a second recombinant yeast strain comprising an adaptor scaffoldin comprising a plurality of cohesin domains and at least one cellulose binding domain (CBD); and at least one recombinant yeast strain comprising a plurality of secreted dockerin-tagged cellulases. In one embodiment, the yeast strains are cultured under conditions wherein the anchoring scaffoldin, the adaptor scaffoldin comprising the cohesin domains and the plurality of dockerin-tagged cellulases associate to generate an engineered cellulosome. In yet another embodiment, the cellulases are selected from endoglucanases, exoglucanases, β-glucosidase, and xylanase. In a further embodiment, the dockerin-tagged cellulase is engineered to comprise a leader sequence for secretion of the dockerin-tagged cellulase.

The disclosure provides a recombinant yeast strain comprising a heterologous polynucleotide encoding an anchoring scaffoldin.

The disclosure also provides a recombinant yeast strain comprising a heterologous polynucleotide encoding an adaptor scaffoldin comprising a plurality of cohesin domains and at least one cellulose binding domain (CBD).

The disclosure provides a recombinant yeast strain comprising at least one heterologous polynucleotide encoding a secreted dockerin-tagged cellulase.

The disclosure provides a culture comprising at least two recombinant yeast strains, each comprising a portion of a functional cellulosome, wherein upon co-culture a functional cellulosome is generated.

The disclosure also provides a yeast culture comprising at least two recombinant strains of yeast wherein the culture produces a designer cellulosome, and wherein the yeast culture catabolizes cellulosic material to produce a biofuel.

The disclosure also provides a method of producing a biofuel comprising: culturing the yeast as described above in a fermentation broth comprising a cellulosic material, wherein the microorganism produces the biofuel metabolite.

The disclosure further provides a method of designing a cellulosome comprising identifying the cellulosic substrate, identifying at least one enzyme useful for degradation of the cellulosic material, recombinantly engineering a dockerin peptide to the enzyme, cloning a polynucleotide encoding the dockerin-linked enzyme into a microorganism, culturing the microorganism in a culture of at least one additional microorganism expressing a scaffoldin having a plurality of cohesin domains and a cellulosic binding domain, wherein the cohesin and dockerin are compatible, and culturing the microorganisms to express the scaffoldin and dockerin-linked enzymes.

The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.

DESCRIPTION OF DRAWINGS

FIG. 1A-B shows functional assembly of minicellulosomes on the yeast cell surface. A trifunctional scaffoldin (Scaf-ctf) consisting of an internal CBD flanked by three divergent cohesin (C) domains from C. thermocellum (t), C. cellulolyticum (c), and R. flavefaciens (f) was displayed on the yeast cell surface. Three different cellulases (E1, E2, and E3) fused with the corresponding dockerin domain (either Dt, Dc, or Df) were expressed in E. coli. Cell lysates containing these cellulases were mixed with yeast cells displaying Scaf-ctf for the functional assembly of the minicellulosome.

FIG. 2A-D shows phase-contrast and immunofluorescence micrographs of yeast cells displaying minicellulosomes. (A) Cells displaying either scaffoldin Scaf-c, Scaf-ct, or Scaf-ctf. Functional assembly of three dockerin-tagged cellulases (CelE-Dc [Ec], CelA-Dt [At], or CelG-Df [Gf]) on cells displaying (B) Scaf-ctf, (C) Scaf-ct, or (D) Scaf-c. Cells were probed with either anti-c-Myc or anti-c-His6 serum and fluorescently stained with a goat anti-mouse IgG conjugated with Alexa Fluor 488. Cells displaying only the scaffoldins were used as controls.

FIG. 3 shows fluorescence intensity of cells either displaying scaffoldin Scaf-ctf or with different combinations of dockerin-tagged cellulases (At [A], Ec [E], and Gf [G]) docked on the displayed Scaf-ctf. Cells were probed with either anti-c-Myc or anti-c-His6 serum and fluorescently stained with goat anti-mouse IgG conjugated with Alexa Fluor 488. Whole-cell fluorescence was determined with a fluorescence microplate reader. Cells displaying only Scaf-ctf were used as controls. RFU, relative fluorescence units.

FIG. 4 shows a graph of whole-cell hydrolysis of CMC by different cellulase pairs (CelE-Dc [Ec], CelA-Dt [At], or CelG-Df [Gf]) docked on the displayed Scaf-ctf protein. Cells displaying only Scaf-ctf were used as controls.

FIG. 5A-D shows graphs of cellulosome activity. Production of glucose (A) and reducing sugars (B) from the hydrolysis of PASC by free enzymes and by surface-displayed cellulosomes. Reactions were conducted either with different cellulase pairs (CelE-Dc [Ec], CelA-Dt [At], or β-glucosidase-Df [BglA]) docked on the displayed Scaf-ctf protein or with the corresponding purified cellulases. Cells displaying only Scaf-ctf were used as controls. (C) Activity associated with cells and (D) activity in the medium at different initial OD ratios.

FIG. 6A-B shows time profiles of ethanol production (A) and cellulose hydrolysis (B) from PASC by control strain EBY100 plus free enzymes and yeast cells displaying functional cellulosomes. Fermentations were conducted either with different cellulase pairs (CelE-Dc [Ec], CelA-Dt [At], or β-glucosidase-Df [BglA]) docked on cells displaying Scaf-ctf or with control strain EBY100 plus the corresponding purified cellulases. Cells displaying only Scaf-ctf were used as controls. The individual enzyme amounts were the same in all cases.

FIG. 7 shows a synthetic consortium for the display of complex cellulosomes.

DETAILED DESCRIPTION

As used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a probe” includes a plurality of such probes and reference to “the primer” includes reference to one or more primers and equivalents thereof known to those skilled in the art, and so forth.

Also, the use of “or” means “and/or” unless stated otherwise. Similarly, “comprise,” “comprises,” “comprising,” “include,” “includes,” and “including” are interchangeable and not intended to be limiting.

It is to be further understood that where descriptions of various embodiments use the term “comprising,” those skilled in the art would understand that in some specific instances, an embodiment can be alternatively described using language “consisting essentially of” or “consisting of.”

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Although any methods and reagents similar or equivalent to those described herein can be used in the practice of the disclosed methods and compositions, the exemplary methods and materials are now described.

All publications mentioned herein are incorporated herein by reference in full for the purpose of describing and disclosing the methodologies, which are described in the publications, which might be used in connection with the description herein. The publications discussed above and throughout the text are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the inventors are not entitled to antedate such disclosure by virtue of prior disclosure.

Biomass represents an inexpensive feedstock for sustainable bioethanol production. Among the three biological events that occur during the conversion of cellulose to ethanol, i.e., enzyme production, polysaccharide hydrolysis, and sugar fermentation, cellulose hydrolysis is widely recognized as the key step in making bioconversion economically competitive. In addition, it is believed that a significant cost reduction can be achieved when two or more steps are combined, such as in consolidated bioprocessing (CBP). To achieve this goal, the disclosure provides methods and cellular compositions comprising a functional assembly of a minicellulosome on a yeast cell surface. The minicellulosome was engineered to render the ethanologenic microbe cellulolytic.

In one embodiment, the disclosure achieves the method and composition by first engineering a chimeric minicellulosome containing three dockerin-cohesin pairs from different species on the yeast cell surface. Immunofluorescence microscopy showed the successful translocation of the miniscaffoldin to the yeast cell surface, and retention of the functionality of the cohesin domains was confirmed by observing the successful assembly of the corresponding dockerin-tagged cellulases. Since the specificity of the dockerin-cohesin pairs is preserved, it is possible to direct any enzymatic subunit to a specified position within a modular scaffoldin by tagging with the designated dockerin.

The disclosure further demonstrates a synergistic effect on cellulose hydrolysis compared with that of free enzymes.

In the compositions and methods of the disclosure the displayed minicellulosome retained this key characteristic. Interestingly, the level of synergy increased with an increasing number of cellulases docked on the cell surface. This synergistic effect was preserved even when a new minicellulosome comprising a β-glucosidase (BglA), an endoglucanase (At), and an exoglucanase (Ec) was assembled on the yeast cell surface.

The disclosure further demonstrates that the methods and compositions of the disclosure are useful for ethanol production. Cellulose hydrolysis and ethanol production were tested with both free enzymes and a displayed minicellulosome. Independent of the number of cellulases incorporated in the minicellulosome, similar levels of enhancement of cellulose hydrolysis, as well as ethanol production, were detected. The ethanol production achieved, in particular, was more than 2.6-fold higher than that of the culture in which all three cellulases were added as free enzymes. This, when combined with an ethanol yield close to 95% of the theoretical maximum, makes this an efficient process for direct fermentation of cellulose to ethanol.
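As a rough illustration of what an ethanol yield "close to 95% of the theoretical maximum" implies, the following sketch uses the standard stoichiometry of glucose fermentation (one glucose converted to two ethanol and two carbon dioxide), which gives a theoretical maximum of about 0.51 g ethanol per g glucose. The function names and the example masses are illustrative assumptions and are not taken from the disclosure.

```python
# Standard molar masses in g/mol.
GLUCOSE_MW = 180.16
ETHANOL_MW = 46.07


def theoretical_ethanol_yield() -> float:
    """Maximum g ethanol per g glucose for C6H12O6 -> 2 C2H5OH + 2 CO2."""
    return 2 * ETHANOL_MW / GLUCOSE_MW


def fraction_of_theoretical(ethanol_g: float, glucose_consumed_g: float) -> float:
    """Observed yield divided by the theoretical maximum yield."""
    if glucose_consumed_g <= 0:
        raise ValueError("glucose consumed must be positive")
    observed_yield = ethanol_g / glucose_consumed_g
    return observed_yield / theoretical_ethanol_yield()


if __name__ == "__main__":
    # Hypothetical example: 0.486 g ethanol from 1.0 g glucose equivalents.
    print(f"theoretical yield: {theoretical_ethanol_yield():.3f} g/g")
    print(f"fraction of maximum: {fraction_of_theoretical(0.486, 1.0):.2%}")
```

The calculation is expressed per gram of glucose equivalents released by hydrolysis; a calculation per gram of cellulose would need an additional factor for the water added during hydrolysis.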

Current production processes for using crops such as sugar cane and cornstarch for bioethanol production are well established. However, since the cost of raw materials can be as high as 40% of the overall process, utilization of a cheaper substrate would render bioethanol more competitive with fossil fuel (Zaldivar et al., 2001). Among the different forms of biomass, lignocellulosic biomass is particularly well-suited for energy applications because of its large-scale availability, low cost and environmentally benign production (Lynd et al., 1999). This natural and abundant polymer is found as agricultural waste (wheat straw, corn stalks, soybean residues), industrial waste (pulp and paper industry), forestry residues, and municipal solid waste. Many energy production and utilization cycles based on cellulosic biomass have near-zero greenhouse gas emissions on a life-cycle basis (Lynd et al., 2005).

The primary obstacle impeding the more widespread production of energy from biomass is the absence of a low-cost technology for overcoming the recalcitrance of these materials (Lynd et al., 2008). For cellulose to be amenable to fermentation, it needs to undergo several treatments to release its monomeric sugars (Zaldivar et al., 2001). Two main steps are: (1) pre-treatments that remove lignin and expose cellulose for enzymatic degradation, and (2) an enzymatic treatment to generate glucose from cellulose before fermentation. The high cost of cellulases needed for cellulose hydrolysis is one of the major obstacles in the quest for an economically feasible cellulose-based bioethanol process (McBride et al., 2005).

Although the cost of bioethanol production can become more competitive by combining the hydrolysis and fermentation steps in simultaneous saccharification and cofermentation (SSCF) of both hexoses and pentoses, it has been shown that the overall cost can be even further reduced by 4-fold using a one-step “consolidated” bioprocessing (CBP) of lignocellulose to bioethanol, where cellulase production, cellulose hydrolysis and sugar fermentation can be mediated by a single microorganism or microbial consortium. An ideal microorganism for CBP should possess the capability of simultaneous cellulose saccharification and ethanol fermentation. One attractive candidate is Saccharomyces cerevisiae, which is widely used for industrial ethanol production due to its high ethanol productivity and high inherent ethanol tolerance.

However, due to energetic limitations under anaerobic conditions, often only a limited amount of cellulases can be secreted, resulting in relatively low rates of cellulose hydrolysis. It is believed that, for a process to be viable economically, it must have productivity greater than 1 g/l/hr (Zaldivar et al., 2001). For example, where cellulases are displayed on the yeast surface the productivity (0.075 g/l/hr) was more than one order of magnitude lower than that of strains fermenting glucose.
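The size of this gap can be made concrete with a small calculation. The sketch below compares the reported surface-display productivity (0.075 g/l/hr) with the 1 g/l/hr viability figure cited above; this particular comparison is an illustrative assumption, since the text compares against glucose-fermenting strains without giving their productivity. The function names are hypothetical.

```python
import math

VIABILITY_THRESHOLD = 1.0   # g/l/hr, figure cited for economic viability
SURFACE_DISPLAY = 0.075     # g/l/hr, reported for surface-displayed cellulases


def fold_shortfall(observed: float, target: float) -> float:
    """How many times higher the target productivity is than the observed one."""
    if observed <= 0:
        raise ValueError("observed productivity must be positive")
    return target / observed


def orders_of_magnitude(fold: float) -> float:
    """Express a fold difference as a base-10 logarithm."""
    return math.log10(fold)


if __name__ == "__main__":
    fold = fold_shortfall(SURFACE_DISPLAY, VIABILITY_THRESHOLD)
    print(f"shortfall: {fold:.1f}-fold ({orders_of_magnitude(fold):.2f} orders of magnitude)")
```

On these numbers the shortfall is roughly 13-fold, which is consistent with the "more than one order of magnitude" description.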

Unfortunately, substantial improvement in cellulose hydrolysis may not come from simply increasing the amount of enzymes secreted to the medium or displayed on the surface, which is limited under anaerobic conditions. The key to improving hydrolysis is, perhaps, to increase the catalytic efficiency by maximizing the synergy with a limited amount of enzymes. Recently, it has been demonstrated that the use of ternary cellulose-enzyme-microbe complexes yields much higher rates of cellulose hydrolysis than using binary cellulose-enzyme complexes (Lu et al., 2006). This enzyme-microbe synergy requires the presence of metabolically active Clostridium thermocellum displaying cellulosomes and appears to be a surface phenomenon involving microbial adhesion onto the cellulose. The 2- to 4-fold synergistic effect observed is significant in decreasing the cost of cellulose hydrolysis.

The role of cellulosomes can be understood by contrasting two distinct mechanisms that microbes use to tackle recalcitrant cellulose. Aerobic microbes (such as Trichoderma reesei) produce copious amounts of soluble hydrolytic enzymes that synergistically break down cellulosic materials (Wilson, 2004; Bayer et al., 2000). In contrast, anaerobic organisms, due to energetic constraints, can only produce a limited amount of enzymes. In response, anaerobic organisms have become more efficient and have developed an elaborately structured enzyme complex, called a cellulosome, to maximize catalytic efficiency (Bayer et al., 2004; Doi and Kosugi, 2004; Demain et al., 2005). This self-assembled system brings multiple enzymes in close proximity to the substrate, and provides a structure that ensures high local concentration and the correct ratio and order of the enzymes, thereby maximizing synergy. Consequently, it has a much higher catalytic efficiency than soluble enzymes present in a non-organized fashion. A study showed that the structure endowed an enzyme activity increase of up to 50 times (Johnson et al., 1982).

Cellulosomes are self-assembled multi-enzyme complexes presented on the anaerobes' cell surface and are dedicated to cellulose depolymerization. The major component of these macromolecular complexes is a structural scaffoldin consisting of at least one cellulose binding domain (CBD) and repeating cohesin domains, which are docked individually with a cellulase tagged with a dockerin domain (FIG. 1). The CBD serves as a targeting agent to direct the catalytic domains to the cellulose substrates. The specific protein-protein, or complementary cohesin-dockerin, interaction provides the mechanism for position-specific self-assembly. Within a given species, the dockerin component appears to bind to all of the cohesins with similar affinity, thus suggesting a random incorporation of the enzymes in the cellulosome. The relative abundance of the catalytic subunits in the cellulosomes is assumed to reflect the level of expression of the corresponding genes, as in the case of C. cellulolyticum using a genetic approach. These cohesin and dockerin modules are species-specific and do not cross interact (Carvalho et al., 2005; Pages et al., 1997). However, recent studies confirmed the presence of other Type II and Type III cohesin/dockerin pairs within a given species in addition to the original Type I mentioned above. These additional pairs are structurally very different and have been shown to possess different specificities from the Type I cohesin/dockerin pairs (Haimovitz et al., 2008).

The multi-enzyme complex attaches both to the cell envelope and to the substrate, mediating the proximity of the cells to the cellulose (Schwarz, 2001). The ability to target the substrate is one of the reasons for increased catalytic efficiency. In addition, the production of a cellulosome has a number of advantages over soluble enzymes for hydrolysis of cellulose:

1. Synergism is optimized due to the proximity of enzyme components,

2. Non-productive adsorption is avoided by the optimal spacing of components,

3. Competitiveness in binding to a limited number of binding sites is avoided by binding the whole cellulosome complex to a single site through a strong binding domain with low specificity,

4. A halt in hydrolysis on depletion of one structural type of cellulose at the site of adsorption is avoided by the presence of other enzymes with different specificity.

The disclosure provides a recombinant yeast expressing cellulosomal structures. Bioenergetic benefits and synergy are achieved when the cellulosomal structures are displayed onto a microorganism having the ability to ferment biomass to ethanol such as yeast (e.g., S. cerevisiae). The recombinant organisms of the disclosure are useful for bioethanol production as fewer enzymes are needed. Additionally, since glucosidase is typically subjected to product inhibition, presence of active glucose-metabolizing cells should further increase the overall hydrolysis rate. The disclosure provides engineered yeast comprising a consortium capable of displaying the highly efficient cellulose-degrading cellulosome structures for one-step CBP of cellulosic materials.

The functional presentation of various cellulose-binding domains and catalytic subunits in a cellulosome provides improved cellulose hydrolysis over free enzymes as a consequence of the synergistic action among the different components. Because of the modular nature of the cellulosomal subunits the disclosure provides artificial cellulosomes useful in generating biofuels from aerobic organisms at efficiency similar to anaerobic organisms. For example, the disclosure demonstrates the usefulness of a recombinant CBD in yeast using a trifunctional chimeric scaffoldin containing cohesins from a plurality of species. The trifunctional chimeric scaffoldin was constructed and each type of cohesin module was shown to bind specifically to the corresponding dockerin-borne cellulolytic enzymes in vitro. The resulting 6-fold improvement in cellulose hydrolysis over similar free enzymes demonstrates that the “designer cellulosome” of the disclosure can be similarly exploited for whole-cell hydrolysis of cellulose and ethanol production when expressed by yeast (e.g., S. cerevisiae). The disclosure demonstrates that by displaying a mini-scaffoldin onto the yeast surface a designer cellulosome was obtained. In one embodiment, the resulting yeast cells when tagged with three different dockerin-tagged cellulases were shown to degrade cellulose up to 3-fold faster.

In yet another embodiment, the disclosure provides a yeast consortium composed of strains with a surface-display anchoring scaffoldin, strains secreting an adapting scaffoldin, and strains secreting dockerin-tagged cellulases (FIG. 2) for the functional presentation of the complex cellulosome structures.

The disclosure demonstrates a synthetic yeast consortium for functional presentation of the complex cellulosome structures and demonstrates its ability to enhance ethanol production from cellulose. The disclosure provides recombinant yeast for ethanol production comprising a plurality of cohesin and dockerin polypeptides. Currently, the sequences of more than one hundred different cohesins and dockerins from a dozen cellulosome-producing bacteria are known (Haimovitz et al., 2008). In one embodiment, the disclosure provides the use of 2, 3, 4, 5, 6, 7, or 8 different cohesin/dockerin pairs for the cellulosome assembly.

The disclosure provides a synthetic yeast consortium for direct fermentation of cellulose to ethanol with productivity, yield and final concentration close to that from glucose fermentation. The engineering strategy described herein uses the efficiency of hydrolysis and synergy among multi-cellulases, rather than focusing on the amount of enzymes produced or used. To emulate the success of a natural cellulose hydrolysis mechanism, a complex cellulosome structure is assembled on a yeast cell surface using a constructed yeast consortium, which enables the ethanol-producing strains to utilize cellulose and concomitantly ferment it to ethanol. More importantly, by organizing these cellulases in an ordered structure, the enhanced synergy will increase the hydrolysis, and thereby the production of ethanol.

In one embodiment, the disclosure provides a yeast consortium for surface assembly of a mini-cellulosome structure comprising 2, 3 or more cellulases and demonstrates the feasibility of using the consortium for direct ethanol production from cellulose. In another embodiment, the disclosure provides recombinant yeast for surface-display of one or more of the anchoring scaffoldin, the adaptor scaffoldin, and the dockerin-tagged cellulases.

In one embodiment, the yeast consortium provides a cellulosome comprising a CelA from Orpinomyces sp. strain PC-2 (see, e.g., Ljungdahl, 2008; Chen et al., 2006, which are incorporated herein by reference), CelC from Orpinomyces sp. strain PC-2, CelB, CelD, XynA, and a β-glucosidase (Voorhorst, 1995). Sequences for the various dockerins, cohesins and cellulases used in the methods and compositions of the disclosure are readily identifiable by one of skill in the art with reference to readily available databases (e.g., GenBank).

In one embodiment, a strain of yeast can be genetically engineered by recombinant DNA techniques to express a polynucleotide comprising one or more structural components for producing a cellulosome, a polynucleotide comprising a structural element linked to a cellulose degrading enzyme (e.g., a cellulase), or a combination of any polynucleotide encoding a cellulosome structural polypeptide or enzyme. Where a single yeast or microorganism does not express a full complement of structural and enzyme polypeptides for production of a complete cellulosome, a combination of two or more recombinant yeast (e.g., a consortium) can be used wherein the two or more recombinant yeast express portions of a cellulosome that upon combination generate a full cellulosome capable of degrading a cellulose material.

For example, in one embodiment, a yeast can be recombinantly engineered to express a trifunctional scaffoldin comprising an internal CBD flanked by three divergent cohesin domains. A first recombinant yeast strain will comprise a plasmid or vector containing one or more (e.g., continuous or separated by linking domains) polynucleotide(s) encoding the CBD flanked by the three divergent cohesin domains. The polynucleotide sequences for a large number of cohesin domains and their corresponding dockerin domains are known in the art. For example, a search of GenBank will identify numerous sequences for cohesin polynucleotides and polypeptides.

A second yeast strain can be recombinantly engineered to contain a plasmid or vector comprising a sequence encoding one or more dockerin domains linked to a cellulose degrading enzyme. Upon co-culture each strain expresses the corresponding structural or structural-enzymatic polypeptides. The respective dockerin and cohesin domains bind to one another and form a mini-cellulosome. Again, dockerins and enzymes useful in the methods and compositions of the disclosure are recognized in the art and the corresponding sequences are readily identifiable to one of skill in the art performing a simple search on GenBank. For example, the disclosure provides dockerin-enzyme fusion constructs comprising SEQ ID NOs: 5-7. It will be recognized that variants of each of the enzyme domains of the fusion construct may be used, as may variants of each of the dockerin domains. For example, polypeptides having 80%-99% identity to SEQ ID NO:6 or 8 (or 80-99% [85%, 90%, 95%, 97%, 98% etc.] identity to the respective enzyme or dockerin domains of the construct) can be used in the methods and compositions of the disclosure so long as the dockerin domain is capable of binding to its respective cohesin domain and the enzyme domain is capable of degrading a cellulose material. Methods for determining percent identity are well known in the art.
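By way of illustration only, percent identity between two already-aligned polypeptide sequences can be computed as in the following minimal sketch. It assumes the alignment has been produced beforehand by a standard alignment tool, that gaps are written as "-", and that identity is counted over all aligned columns in which at least one sequence has a residue; other conventions for the denominator exist. The function names and the example sequences are hypothetical and do not correspond to any SEQ ID NO of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns, skipping gap-gap columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1  # identical residues (a gap never equals a residue)
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical ten-residue fragments differing at one position.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
    print(meets_identity_threshold("ACDEFGHIKL", "ACDEFGHIKV", 95.0))
```

Sequence identity alone does not establish function; as stated above, a variant is useful only if its dockerin domain still binds its cohesin and its enzyme domain still degrades cellulose.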

A variety of tools are available for the introduction and expression of genes in yeast (such as S. cerevisiae), including established vector series (Gietz and Sugino, 1988; Sikorski and Hieter, 1989) and simultaneous or sequential gene integration vectors for the insertion of multiple genes (Wang and Da Silva, 1996; Parekh et al., 1996; Lee and Da Silva, 1997; Lee and Da Silva, 2007).

Plasmids designed to enable combinatorial plasmid-based testing and seamless transition to genomic integration for fine-tuning of gene number and stable long-term expression can be used. This allows rapid and stable strain construction relative to previous methods. The set of 16 plasmids combines four different marker genes (flanked by loxP sites), two different promoters, and two different replication origins (2μ, CEN/ARS). The autonomous vectors allow initial testing of gene combinations on high and/or low copy plasmids. For insertion into the chromosomes, expression polylinker sites adjacent to the selectable markers facilitate polymerase chain reaction amplification of the test gene and selectable marker using primers with outside ends in the desired genomic target sequences. Using this strategy, genes have been successfully inserted into a set of unique locations in the genome with known expression level. The loxP-mediated excision of the selection marker (Sauer, 1987) allows simultaneous marker excision after a group of genes has been integrated. This set of vectors enables both rapid construction and testing of strains, and development of stable engineered strains for use in any complex medium.
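The count of 16 plasmids follows from the combinatorics of the set: four markers times two promoters times two replication origins. The sketch below enumerates such a set. Only the two origins (2μ and CEN/ARS) are named in the text; the marker and promoter labels used here are hypothetical placeholders chosen for illustration.

```python
from itertools import product

# Placeholder labels: the text specifies four markers and two promoters
# but does not name them, so these identifiers are assumptions.
MARKERS = ["marker1", "marker2", "marker3", "marker4"]
PROMOTERS = ["promoterA", "promoterB"]
# The two replication origins are named in the text.
ORIGINS = ["2μ", "CEN/ARS"]


def enumerate_plasmids(markers, promoters, origins):
    """Return every marker/promoter/origin combination as a dict."""
    return [
        {"marker": m, "promoter": p, "origin": o}
        for m, p, o in product(markers, promoters, origins)
    ]


if __name__ == "__main__":
    plasmids = enumerate_plasmids(MARKERS, PROMOTERS, ORIGINS)
    print(len(plasmids))  # 4 x 2 x 2 combinations
    for plasmid in plasmids[:3]:
        print(plasmid)
```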

For example, using the methods described herein an engineered consortium for cellulose hydrolysis by intercellular complementation is provided and useable in any given ecosystem. The disclosure demonstrates the feasibility of using a yeast consortium for the surface-display of a mini-cellulosome for efficient cellulose hydrolysis. The disclosure demonstrates the correct assembly of secreted At onto the Scaff#3 in a co-culture system. To enable surface display of scaffoldins without galactose induction, expression of the surface anchor AGA1 (FIG. 1) is placed under a constitutive PGK promoter and two copies of the gene cassette are integrated.

Various enzymes can be used to degrade the cellulose material. For example, it has been shown that β-glucosidase is useful for complete degradation of cellulose to glucose. Therefore a well known β-glucosidase BglA from C. thermocellum was tagged with the dockerin domain (Bf), produced in E. coli and incorporated into the cellulosome structures. The result demonstrates the enhancement effect on glucose liberation by assembling BglA into the mini-cellulosome. Therefore, for the initial demonstration, a mini-cellulosome containing an endoglucanase (At), an exoglucanase (Ec) and BglA is assembled. At was secreted into the medium using the secretion leader sequence MFα1; a similar strategy is employed for the secretion of Ec and Bf. Secretion of the structure-enzyme can be confirmed using various methods in the art. For example, secretion of Ec and Bf into the medium can be confirmed by Western blotting using a FLAG tag on Ec and an S-tag on Bf. After confirming secretion, the activity of the secreted fusion constructs can be assayed. For example, cellulase activity can be determined using Avicel or cellobiose as the substrate. Finally, cells secreting either Ec or Bf are co-cultured with cells displaying Scaff#3 and the correct assembly of the cellulases onto the cell surface can be confirmed by immunofluorescence microscopy and the ability to hydrolyze Avicel or cellobiose. After confirming the assembly of individual secreted cellulases, the feasibility of assembling the complete mini-cellulosome is tested. To begin with, a co-culture of different yeast strains is tested in SDC medium. The correct assembly of all three cellulases is confirmed by immunofluorescence using a unique tag presented on each one.

In yet another embodiment, a single yeast strain capable of secreting all of Ec, At and Bf and a strain displaying Scaff#3 are provided by taking advantage of the rapid sequential integration approach. This method also allows precise regulation of expression by controlling the integrated gene copy number (1 to 5). Small-scale shake-flask cultures are used for varying the initial inoculation cell density from a ratio of 10 to 0.1. The specific culturing conditions that result in the highest number of fluorescent cells with all three tags can then be used. The ability of the consortium to hydrolyze Avicel and the corresponding ethanol production can be measured using standard techniques. For example, cells are grown aerobically in SD medium using glucose as the carbon source. The resuspended cells are then used in anaerobic fermentation (SDC medium) using Avicel or phosphoric acid swollen cellulose (PASC) as the carbon source. Samples of the culture can then be obtained for monitoring the expression level, reducing sugar, intermediates, and cell growth. The ratio of the two different cell populations can be modified to maximize synergy. A coordinated gene expression system leads to the detection of only glucose, whereas accumulation of other products indicates that the level of secreted enzymes (or cell population) should be adjusted. For example, if cellobiose is found to accumulate, it indicates that the glucosidase activity is too low relative to the other enzymes, and the problem can be addressed with this strategy by increasing the gene dosage for the secreted glucosidase.

To achieve the cellulosome structure as shown in FIG. 7, five additional cohesin/dockerin pairs can be used. Since most of them are species specific, three additional cohesin/dockerin pairs from B. cellulosolvens (bc), Ace. cellulolyticus (ac), and C. acetobutylicum (cc) can be used. In addition, Type II cohesin/dockerin pairs from Clostridium thermocellum (T) and Clostridium cellulolyticum (C) are used in conjunction with the Type I pairs (c, t, and f). A feature of the design is the common display of an anchoring scaffoldin, which will enable the surface-display of the complex cellulosome onto all cells except for the strain designed to secrete the adaptor scaffoldins. Since the dockerins used on the cellulolytic enzymes have no cross affinity with the cohesins on the anchoring scaffoldin, no interaction or interference with the translocation machinery is expected. As a result, the resulting consortium will be comprised of cells displaying a functional cellulosome with virtually no carbon source wasted.

To construct the synthetic consortium, different strains are generated. First, a yeast strain designed to display an anchoring scaffoldin (anScaff) containing the cohesin domains from bc and cc is provided. Synthetic oligos coding for two cohesin domains are synthesized and used for plasmid construction. These two domains are joined by a 10 amino acid linker containing GS repeats flanked by a FLAG tag and displayed on the yeast surface using the same GPI anchor. To add the adaptor scaffoldins onto the displayed anScaff, a yeast strain designed to secrete two separate adaptor scaffoldins is provided. The first adaptor scaffoldin (adScaff#1) comprises an N-terminus bc dockerin flanked by three cohesin domains (t, f, c), one CBD domain, and a C-myc tag. The second adaptor scaffoldin (adScaff#2) comprises an N-terminus cc dockerin flanked by three cohesin domains (ac, type II t and type II c), one CBD domain, and an S-tag. Finally, two different strains are provided to secrete three different dockerin-tagged cellulases each (2 endoglucanases, 2 exoglucanases, one β-glucosidase, and one xylanase) secreted using the MFα1 leader sequence. A His6 tag is added to each cellulase. In this configuration, the consortium will comprise four strains and three will have the complex cellulosome displayed on the surface.

In addition to At, Ec, and BglA used above, other enzymes from anaerobic fungi that form cellulosomes can be used to demonstrate the complex cellulosome structure. This choice is based on the finding that enzymes from these anaerobic fungi have specific activities much higher than enzymes from other sources (Ljungdahl, 2008). Additionally, several of these enzymes were successfully over-expressed in S. cerevisiae (Ljungdahl, 2008; Chen et al., 2006), suggesting over-expression of enzymes may not be a significant obstacle. Furthermore, since these enzymes are cellulosomal, the structural features and folding mechanism are likely more compatible with the designed structure than those of non-cellulosomal enzymes. For example, a cellulosome of the disclosure can comprise CelA and CelC from Orpinomyces sp. strain PC-2, which are both GH family 6 enzymes possessing both endo- and exoglucanase activities; two family 5 cellulases, CelB and CelD, both endoglucanases, are cloned and fused with an appropriate dockerin; the catalytic domain of XynA, a family GH11 xylanase with extremely high activity, is used as the fifth enzyme; and a family 1 β-glucosidase from Piromyces sp. strain E2 is used as a sixth enzyme.

All of these strains are constructed using the rapid and stable multi-gene integration systems described above; 1 to 5 copies of each gene cassette are integrated in order to optimize the required expression. Expression is under the control of the PGK promoter based on its constitutive nature and the high level of protein expression. The number of anScaff molecules displayed on the cell surface will be determined by measuring the fluorescence intensity of the cell pellet suspended in PBS buffer (pH 7.0) using a fluorometer. A standard curve can be prepared by using known amounts of Alexa Fluor 546-conjugated goat anti-mouse IgG. The number of anScaff displayed will be calculated using (RFU×0.945×10⁷)/1×10⁷ as reported previously. Similarly, the secretion level for the two adScaffs and the different cellulases can be determined. The secretion of adScaff in the medium can be confirmed first by incubating the medium with Avicel for 1 h. After incubation, Avicel is separated by centrifugation and, after washing three times with PBS buffer, the bound protein is analyzed by SDS-PAGE and Western blot against either the S-tag or C-myc tag. The amount of cellulases produced is analyzed by SDS-PAGE and enzyme assays using procedures that are already in place. After confirming expression, the four different strains are co-cultured and the correct assembly of the adaptor scaffoldins and cellulases onto the cell surface is confirmed by immunofluorescence microscopy and the ability to hydrolyze Avicel. Xylanase activity can be measured as described by Bailey et al. (1992).
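The standard-curve step can be illustrated with an ordinary least-squares line relating known amounts of labeled antibody to measured fluorescence. This is a sketch under stated assumptions: the function names are hypothetical, a linear response is assumed, and the code does not reproduce the conversion factor quoted above.

```python
# Hypothetical sketch of a linear standard curve (RFU versus known amount).
def fit_line(amounts, rfus):
    """Least-squares fit of rfu = slope * amount + intercept."""
    n = len(amounts)
    mean_x = sum(amounts) / n
    mean_y = sum(rfus) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, rfus))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_rfu(rfu, slope, intercept):
    """Invert the fitted line to estimate the amount for a measured RFU."""
    return (rfu - intercept) / slope


if __name__ == "__main__":
    # Placeholder calibration points, not measured data.
    slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
    print(amount_from_rfu(9, slope, intercept))
```

The estimate is only meaningful within the range covered by the calibration points.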

The disclosure demonstrates a synthetic consortium for surface-display of a complex cellulosome by combining cells displaying the anchoring and adaptor scaffoldins with cells secreting the dockerin-borne cellulases. The modular nature of the cellulosome, the great diversity of cellulases, and the availability of gene fusion technology provide an almost unlimited number of combinations of enzymes and ways to incorporate them into an artificial cellulosome designed to optimize their activities to approach or even surpass the natural cellulosome. It has been shown by Fierobe et al. (2002; 2005) that cellulosomes containing different cellulases have significantly different abilities for cellulose hydrolysis; even the same cellulases fused to different dockerins can result in cellulosomes with substantially different hydrolysis efficiencies (up to 2-3 fold), suggesting the order of enzymes on the cellulosome can directly impact the overall activity. In an effort to create the most efficient combination for cellulose hydrolysis, the dockerins carried by the enzymes can be exchanged. All possible dockerin combinations will be created and the resulting activity of the cellulosomes can be compared for hydrolysis. Overlap-extension PCR can be used for replacement of the dockerins. Briefly, the reverse primers for the region coding for the C-terminal part of the catalytic domain of each cellulase can be overlapped with the forward primers for the region coding for the N-terminal part of the dockerin domain. After several cycles of denaturation, annealing, and extension in PCR, the resultant overlapping fragments will be mixed and the combined fragment will be synthesized by using external primers. The DNS method as described herein can be used for evaluating the efficiencies of the different cellulosome complexes and their synergy effects as well as glucose liberation.
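The set of "all possible dockerin combinations" can be enumerated programmatically. The sketch below assumes each enzyme receives exactly one dockerin and each dockerin is used once, which is one reading of the text; the function name and the example labels are illustrative.

```python
from itertools import permutations


def dockerin_assignments(enzymes, dockerins):
    """List every one-to-one assignment of dockerins to enzymes.

    Assumes equal numbers of enzymes and dockerins, each dockerin
    used exactly once per construct set.
    """
    if len(enzymes) != len(dockerins):
        raise ValueError("need one dockerin per enzyme")
    return [dict(zip(enzymes, perm)) for perm in permutations(dockerins)]


if __name__ == "__main__":
    # Three catalytic domains and the three Type I dockerins (t, c, f).
    for assignment in dockerin_assignments(["CelA", "CelE", "BglA"], ["t", "c", "f"]):
        print(assignment)
```

For n enzymes this yields n! construct sets, so three enzymes give six sets and six enzymes would give 720, which bounds how far exhaustive screening is practical.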

After optimizing the activity of the cellulosome, the ability of the consortium to hydrolyze Avicel and the corresponding ethanol production can be analyzed. Initially, growth rates of each individual strain are determined to check for any substantial difference in cell growth. Samples are taken periodically for monitoring the expression level, reducing sugar, intermediates, and cell growth. One can coordinate the four different cell populations so that maximum synergy can be obtained and no enzyme represents a limiting step. This can be ensured by using cellulose hydrolysis and following the hydrolysis products by a carbohydrate analysis method established on a Dionex High-pH Ion-Exchange Chromatograph System with electrochemical detector, which detects the hydrolysis products (cellobiose, glucose and the like), allowing one to pinpoint limiting enzyme activities, if any (endo/exo glucanases or β-glucosidase). A coordinated gene expression system leads to the detection of only glucose, whereas accumulation of other products indicates that the level of secreted enzymes (or cell population) should be adjusted by varying the gene copy number.

The engineered strains can be evaluated for cellulose hydrolysis and ethanol production under different conditions such as resting and growth conditions in SDC medium. Both small and large-scale (shaker flask/one-liter bioreactor) studies can be performed. In resting cell experiments, cells are grown aerobically using glucose as the carbon source. Cells are then washed and used in anaerobic cellulose hydrolysis. Enzyme activity, the integrity of the cellulosome, hydrolysis products, glucose, and ethanol will be monitored using methods described herein. In studies carried out in a fermentor, mild agitation can be used to promote mixing of solid cellulose material with cells. Once optimized, an industrial yeast fermentation process may be used. Different cellulose concentrations can also be used. The rate of glucose generation will be estimated from the experiments and compared to that of cells without a cellulosome on the cell surface (but with comparable enzyme expression levels). In studies under growing conditions, the cells will be provided cellulose as the sole carbon source, along with other nutrients necessary for growth. Anaerobic conditions are maintained. Cell biomass, enzyme activities, cellulosome integrity, any possible accumulated hydrolysis products including glucose, and ethanol are measured.

The disclosure provides yeast strains for direct fermentation of cellulose to ethanol, eliminating the need for use of purified cellulases. The methods and compositions of the disclosure allow abundant, low-cost agricultural residue to be used as raw material for ethanol production. The increased production of ethanol reduces not only pollution to the environment but also the need for imported petroleum as transportation fuel. Collectively, the benefits from the invention include at least efficient, economical, and environmentally friendly conversion of biomass.

Current biofuel production processes are exclusively based upon the soluble enzyme approach, the less efficient model utilized by aerobic microbes. The construction of cellulosomes on the cell surface as described herein departs from the current model of enzymatic hydrolysis. A difference lies in the extent of synergism. While the secretion or co-display of cellulases (without the cellulosome structure) permits a synergistic use of these enzymes to some extent, both follow the model of aerobic organisms in cellulose hydrolysis. As exemplified by Trichoderma reesei, this model requires the production of abundant enzymes without needing to maximize the efficiency and synergism. Since ethanol production is carried out anaerobically, the limited amount of enzyme production makes the high coordination of different enzyme components necessary to maximize the synergy and efficiency.

As used herein, the term “microorganism” includes prokaryotic and eukaryotic microbial species from the Domains Archaea, Bacteria and Eucarya, the latter including yeast and filamentous fungi, protozoa, algae, or higher Protista. The terms “microbial cells” and “microbes” are used interchangeably with the term microorganism.

As used herein, the term “polynucleotide” refers to a polymer of nucleic acid bases such as deoxyribonucleic acid (DNA), and, where appropriate, ribonucleic acid (RNA). The term should also be understood to include analogs of either RNA or DNA made from nucleotide analogs, and, as applicable to the embodiment being described, single (sense or antisense) and double-stranded polynucleotides.

The term “carbon source” generally refers to a substrate or compound suitable to be used as a source of carbon for bacterial or simple eukaryotic cell growth. Carbon sources may be in various forms, including, but not limited to, polymers, carbohydrates such as cellulosic material including cellulooligosaccharides and lignocellulose, acids, alcohols, aldehydes, ketones, amino acids, peptides, etc. These include, for example, various monosaccharides such as glucose, oligosaccharides, polysaccharides, saturated or unsaturated fatty acids, succinate, lactate, acetate, ethanol, and the like, or mixtures thereof.

The term “substrate” or “suitable substrate” refers to any substance or compound that is converted or meant to be converted into another compound by the action of an enzyme. The term includes not only a single compound, but also combinations of compounds, such as solutions, mixtures and other materials which contain at least one substrate, or derivatives thereof. Further, the term “substrate” encompasses not only compounds that provide a carbon source suitable for use as a starting material, but also intermediate and end product metabolites used in a pathway associated with an engineered microorganism as described herein. A “biomass derived sugar” includes, but is not limited to, molecules such as glucose, sucrose, mannose, xylose, and arabinose. Exemplary substrate sources include alfalfa, corn stover, crop residues, debarking waste, forage grasses, forest residues, municipal solid waste, paper mill residue, pomace, sawdust, spent grains, spent hops, switchgrass, and wood chips.

As used herein, the terms “gene” and “recombinant gene” refer to an exogenous nucleic acid sequence which is transcribed and (optionally) translated. Thus, a recombinant gene can comprise an open reading frame encoding a polypeptide. In such instances, the sequence encoding the polypeptide may also be referred to as an “open reading frame”.

“Transcriptional regulatory sequence” is a generic term used throughout the specification to refer to DNA sequences, such as initiation signals, enhancers, and promoters, which induce or control transcription of a gene or genes with which they are operably linked.

“Operably linked” means that a gene and transcriptional regulatory sequence(s) are connected in such a way as to permit expression of the gene in a manner dependent upon factors interacting with the regulatory sequence(s).

“Exogenous” means a polynucleotide or a peptide that has been inserted into a host cell. An exogenous molecule can result from the cloning of a native gene from a host cell and the reinsertion of that sequence back into the host cell. In most instances, exogenous sequences are sequences that are derived synthetically, or from cells that are distinct from the host cell.

The terms “host cells” and “recombinant host cells” are used interchangeably herein. It is understood that such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

As used herein, a “reporter gene” is a gene whose expression may be assayed; reporter genes may encode any protein that provides a phenotypic marker, for example: a protein that is necessary for cell growth or a toxic protein leading to cell death, e.g., a protein which confers antibiotic resistance or complements an auxotrophic phenotype; a protein detectable by a colorimetric/fluorometric assay leading to the presence or absence of color/fluorescence; or a protein providing a surface antigen for which specific antibodies/ligands are available.

The term “biosynthetic pathway”, also referred to as “metabolic pathway”, refers to a set of anabolic or catabolic biochemical reactions for converting (transmuting) one chemical species into another. Gene products belong to the same “metabolic pathway” if they, in parallel or in series, act on the same substrate, produce the same product, or act on or produce a metabolic intermediate (i.e., metabolite) between the same substrate and metabolite end product.

As used herein, the term “metabolic pathway” includes catabolic pathways and anabolic pathways, both natural and engineered (i.e., synthetic). Anabolic pathways involve constructing a larger molecule from smaller molecules, a process requiring energy. Catabolic pathways involve breaking down of larger molecules, often releasing energy. An anabolic pathway is referred to herein as “a biosynthetic pathway.”

Biofuel is any fuel that derives from biomass, that is, from organisms such as plants, from fermentation waste, or from metabolic by-products such as manure from cows. It is a renewable energy source, unlike other natural resources such as petroleum, coal and nuclear fuels. Agricultural products specifically grown for use as biofuels and waste from industry, agriculture, forestry, and households (including straw, lumber, manure, sewage, garbage and food leftovers) can be used for the production of bioenergy.

Cellulose is a polysaccharide, a polymer of beta-glucose. It forms the primary structural component of plants and is not digestible by humans. Cellulose is a common material in plant cell walls and was first noted as such in 1838. Cellulose is the most abundant form of living terrestrial biomass (Crawford, R. L. 1981. Lignin biodegradation and transformation, John Wiley and Sons, New York.). Cellulose is also the major constituent of paper. Cellulose monomers (beta-glucose) are linked together through beta-1,4 glycosidic bonds.

A polynucleotide, polypeptide, or peptide may have a certain percent “sequence identity” to another polynucleotide or polypeptide, meaning that, when aligned, that percentage of bases or amino acids are the same when comparing the two sequences. Sequence similarity can be determined in a number of different manners. To determine sequence identity, sequences can be aligned using the methods and computer programs, including BLAST, available over the world wide web at ncbi.nlm.nih.gov/BLAST. See, e.g., Altschul et al. (1990), J. Mol. Biol. 215:403-10.
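As an illustration of the definition above, percent identity can be computed from two sequences that have already been aligned. This is a minimal sketch, not the BLAST calculation: the function name is hypothetical, and the choice of denominator (all alignment columns except those where both sequences have a gap) is an assumption, since alignment programs differ on this point.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared alignment columns with identical residues.

    Both inputs must be pre-aligned and of equal length. Columns where
    both sequences carry a gap are skipped; a gap opposite a residue
    counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    print(percent_identity("ACGT", "ACGA"))
```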

Exemplary polynucleotides and polypeptides are provided herein. One of skill in the art will recognize that minor modifications (e.g., conservative substitutions and the like) can be made to the polypeptide without destroying the biological/enzymatic activity of the polypeptide. Such modification, variation and the like are within the skill in the art as it relates to molecular biology. Screening for activity of such modified polypeptides is described herein. Accordingly, the disclosure encompasses polynucleotides and polypeptides having at least 60%, 70%, 80%, 90%, 95%, 98% or 99% identity to a sequence as set forth herein and having a biological activity similar to the wild-type molecule.

As will be understood by those of skill in the art, it can be advantageous to modify a coding sequence to enhance its expression in a particular host. The genetic code is redundant with 64 possible codons, but most organisms typically use a subset of these codons. The codons that are utilized most often in a species are called optimal codons, and those not utilized very often are classified as rare or low-usage codons. Codons can be substituted to reflect the preferred codon usage of the host, a process sometimes called “codon optimization” or “controlling for species codon bias.”

Optimized coding sequences containing codons preferred by a particular prokaryotic or eukaryotic host (see also, Murray et al. (1989) Nucl. Acids Res. 17:477-508) can be prepared, for example, to increase the rate of translation or to produce recombinant RNA transcripts having desirable properties, such as a longer half-life, as compared with transcripts produced from a non-optimized sequence. Translation stop codons can also be modified to reflect host preference. For example, typical stop codons for S. cerevisiae and mammals are UAA and UGA, respectively. The typical stop codon for monocotyledonous plants is UGA, whereas insects and E. coli commonly use UAA as the stop codon (Dalphin et al. (1996) Nucl. Acids Res. 24: 216-218). Methodology for optimizing a nucleotide sequence for expression in a plant is provided, for example, in U.S. Pat. No. 6,015,891, and the references cited therein.
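Codon substitution of the kind described above can be sketched as a back-translation with one chosen codon per amino acid. The table below is illustrative only: each codon does encode the listed amino acid in the standard genetic code, but the choices are not measured usage frequencies for any host. The TAA stop corresponds to the UAA stop codon mentioned above for S. cerevisiae. All names are hypothetical.

```python
# Illustrative table: valid standard-code codons, NOT measured host usage data.
PREFERRED_CODON = {
    "M": "ATG",
    "K": "AAA",
    "L": "TTG",
    "G": "GGT",
    "S": "TCT",
    "E": "GAA",
}
STOP_CODON = "TAA"  # DNA form of UAA


def back_translate(peptide, table=PREFERRED_CODON, stop=STOP_CODON):
    """Encode a peptide using one chosen codon per residue, then a stop codon."""
    codons = []
    for residue in peptide.upper():
        if residue not in table:
            raise ValueError(f"no codon chosen for residue {residue!r}")
        codons.append(table[residue])
    return "".join(codons) + stop


if __name__ == "__main__":
    print(back_translate("MKL"))
```

A practical design would also screen the resulting sequence for unwanted restriction sites and repeats, which this sketch does not attempt.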

Those of skill in the art will recognize that, due to the degenerate nature of the genetic code, a variety of DNA compounds differing in their nucleotide sequences can be used to encode a given enzyme of the disclosure. The native DNA sequences encoding the biosynthetic enzymes described above are referenced herein merely to illustrate an embodiment of the disclosure, and the disclosure includes DNA compounds of any sequence that encode the amino acid sequences of the polypeptides and proteins of the enzymes utilized in the methods of the disclosure. In similar fashion, a polypeptide can typically tolerate one or more amino acid substitutions, deletions, and insertions in its amino acid sequence without loss or significant loss of a desired activity. The disclosure includes such polypeptides with different amino acid sequences than the specific proteins described herein so long as the modified or variant polypeptides have the enzymatic anabolic or catabolic activity of the reference polypeptide. Furthermore, the amino acid sequences encoded by the DNA sequences shown herein merely illustrate embodiments of the disclosure.

The recombinant yeast cellulosome of the disclosure can be engineered based upon the cellulosic material to be metabolized. For example, different cellulases and other enzymes may be engineered into a cellulosome pathway depending upon the source of substrate. Exemplary substrate sources include alfalfa, corn stover, crop residues, debarking waste, forage grasses, forest residues, municipal solid waste, paper mill residue, pomace, sawdust, spent grains, spent hops, switchgrass, and wood chips. Some substrate sources can have a larger percentage of cellulose compared to other sources, which may have a larger percentage of hemicellulose. A hemicellulose substrate comprises short, branched chains of sugars and can comprise a polymer of five different sugars. Hemicellulose comprises five-carbon sugars (e.g., D-xylose and L-arabinose) and six-carbon sugars (e.g., D-galactose, D-glucose, and D-mannose) and uronic acid. The sugars are typically substituted with acetic acid. Hemicellulose is relatively easy to hydrolyze to its constituent sugars. When hydrolyzed, the hemicellulose produces xylose (a five-carbon sugar) or six-carbon sugars from hardwoods or softwoods, respectively.

In some instances the feedstock can be pretreated using heat, acid treatment or base treatment. Possible pre-treatments include the use of dilute acid, steam explosion, ammonia fiber explosion (AMFE), and organic solvents (BioCycle, May 2005 News Bulletin, and see: Ethanol from Cellulose: A General Review, P. Badger, p. 17-21 in J. Janick and A. Whipkey (eds.), Trends in New Crops and Uses, ASHS Press, 2002). Typically the pretreatment will be biocompatible or neutralized prior to contact with a recombinant microorganism of the disclosure.

Proteins or polypeptides having the ability to convert the hemicellulose components into carbon sources that can be used as a substrate for biofuel production include, for example, cellobiohydrolases (Accessions: AAC06139, AAR87745, EC 3.2.1.91, 3.2.1.150), cellulases (E.C. 3.2.1.58, 3.2.1.4, Accessions: BAA12070, BAB64431), chitinases (E.C. 3.2.1.14, 3.2.1.17, 3.2.1.-, 3.2.1.91, 3.2.1.8, Accessions: CAA93150, CAD12659), various endoglucanases (E.C. 3.2.1.4, Accessions: BAA92430, AAG45162, P04955, AAD39739), exoglucanases (E.C. 3.2.1.91, Accessions: AAA23226), lichenases (E.C. 3.2.1.73, Accessions: P29716), mannanases (E.C. 3.2.1.4, 3.2.1.-, Accessions: CAB52403), pectate lyases (E.C. 4.2.2.2, Accessions: AAG59609), xylanase (E.C. 3.2.1.136, 3.2.1.156, 3.2.1.8, Accessions: BAA33543, CAA31109) and silase (E.C. 3.2.2.-, 2.7.7.7, Accessions: CQ80097S).

Cellulases are a class of enzymes produced chiefly by fungi, bacteria, and protozoans that catalyze the hydrolysis of cellulose. However, there are also cellulases produced by other types of organisms such as plants and animals. Several different kinds of cellulases are known, which differ structurally and mechanistically. The EC number for cellulase enzymes is E.C. 3.2.1.4. Assays for testing cellulase activity are known in the art.

Polypeptides having xylanase activity are useful for inclusion in synthetic cellulosomes. Xylanase is the name given to a class of enzymes which degrade the linear polysaccharide beta-1,4-xylan into xylose, thus breaking down hemicellulose. The EC numbers for xylanase enzymes are E.C. 3.2.1.136, 3.2.1.156, and 3.2.1.8. Assays for testing xylanase activity are known in the art.

Scaffold or structure polypeptides refer to peptides that do not have enzymatic activity, but rather play a role as a building block. For example, scaffold or structure polypeptides as used herein include scaffoldin, cohesin, and cellulose binding polypeptides useful for generating the cellulosome structure for binding of enzyme(s) to the host cell surface, or for binding the cellulosic material degrading enzyme(s) to the cellulosic material carbon source. In one example, a recombinant cellulosome of the disclosure can comprise a recombinant scaffold polypeptide and/or a recombinant enzymatic polypeptide. For example, a synthetic cellulosome can have a cellulosic material degrading enzyme domain, and one or more structural domains. Depending on the structural peptide domain, the synthetic cellulosome will bind to the carbon source and serve to place the cellulosic material degrading enzyme activity in close physical proximity to the carbon source. In other examples the synthetic cellulosome will have peptide sequences that bind the synthetic cellulosome to the host cell surface and function to place the cellulosic material degrading enzyme activity in close proximity to the cell surface.

Yeast of the disclosure are engineered to convert cellulosic material to a biofuel, such as ethanol, by engineering them to produce both a synthetic cellulosome and cellulose degrading enzymes. Multiple recombinant yeast can be co-cultured to generate the cellulosome, each of a plurality of recombinant yeast producing one or more, but not all, of the elements of a functional cellulosome.

One of ordinary skill in the art will appreciate that there are a variety of synthetic cellulosomes that can be made, for example, comprising a variety of degradation enzymes for a specific cellulose- or hemicellulose-containing material, and a variety of associated scaffoldin and cohesin molecules, to generate a recombinant cellulosome having a desired efficiency or pathway of degrading enzymes provided herein.

A microorganism (e.g., a yeast) is engineered to express the synthetic cellulosome by constructing a vector containing a scaffoldin domain, a Carbohydrate Binding Domain (CBM), and one or more cellulosic material-degrading enzymes that have been fused with cohesin domains. In one embodiment, a plurality of different recombinant microorganisms are generated, each expressing a desired element of a cellulosome. For example, a first recombinant yeast can comprise a polynucleotide encoding a heterologous anchoring protein, a second recombinant yeast can comprise a polynucleotide encoding a soluble scaffoldin unit comprising a plurality of cohesin units and a cellulose binding domain, and a third recombinant yeast can comprise a polynucleotide encoding an enzyme (e.g., a cellulase or other degradation enzyme) linked to a dockerin subunit. The cells can be recombinantly engineered to produce the elements of the cellulosome, wherein a functional cellulosome is produced and functions in culture.

The following examples are intended to illustrate but not limit the disclosure. While they are typical of those that might be used, other procedures known to those skilled in the art may alternatively be used.

EXAMPLES

Strains, Plasmids, and Media.

Escherichia coli strain JM109 [endA1 recA1 gyrA96 thi hsdR17 (rK⁻mK⁺) relA1 supE44 Δ(lac-proAB)] was used as the host for genetic manipulations. E. coli BL21(DE3) [F− ompT gal hsdSB (rB−mB−) dcm lon λDE3] was used as the production host for cellulase expression. S. cerevisiae strain EBY100 [MATa AGA1::GAL1-AGA1::URA3 ura3-52 trp1 leu2-1 his3-200 pep4::HIS3 prb1-1.6R can1 GAL] was used for surface display of scaffoldins. All E. coli cultures were grown in Luria-Bertani (LB) medium (10.0 g/liter tryptone, 5.0 g/liter yeast extract, 10.0 g/liter NaCl) supplemented with either 100 μg/ml ampicillin or 50 μg/ml kanamycin. All yeast cultures were grown in SDC medium (20.0 g/liter dextrose, 6.7 g/liter yeast nitrogen base without amino acids, 5.0 g/liter Casamino Acids).
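The per-liter recipes above scale linearly with batch volume. The sketch below uses the SDC concentrations stated in the text; the function and variable names are hypothetical, and the 0.2-liter volume is only an example.

```python
# SDC recipe in g/liter, as stated in the text.
SDC_G_PER_L = {
    "dextrose": 20.0,
    "yeast nitrogen base without amino acids": 6.7,
    "Casamino Acids": 5.0,
}


def grams_needed(recipe_g_per_l, volume_l):
    """Scale a g/liter recipe to the requested batch volume in liters."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {name: conc * volume_l for name, conc in recipe_g_per_l.items()}


if __name__ == "__main__":
    for name, grams in grams_needed(SDC_G_PER_L, 0.2).items():
        print(f"{name}: {grams:.2f} g")
```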

To display scaffoldins, a gene fragment coding for a scaffoldin containing three cohesins from C. cellulolyticum, C. thermocellum, and R. flavefaciens and one CBD was amplified with plasmid pETscaf6 as the template with forward primer F1NdeI (5′-TATAGCTAGCGGCGATTCTCTTAAAGTTACAGT-3′ [the boldface portion is a restriction endonuclease site]) and reverse primer R1SalI (5′-ATATGTCGACGTGGTGGTGGTGGTG-3′). The PCR product was then digested and ligated into the surface display vector pCTCON2 to form pScaf-ctf. Similar procedures, except that the reverse primers were changed to RASalI (5′-ATATGTCGACATCTGACGGCGGTATTGTTGTTG-3′) and RBSalI (5′-ATATGTCGACTATATCTCCAACATTTACTCCAC-3′), were used for the construction of pScaf-c and pScaf-ct.
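Because the boldface marking of the restriction sites does not survive in plain text, the sites can be located in a primer by sequence. The sketch below uses the well-known recognition sequences of a few common enzymes; the function name is hypothetical, and the lookup works from the primer sequence alone, not from the primer's name.

```python
# Standard recognition sequences for a few common restriction enzymes.
RECOGNITION_SITES = {
    "NdeI": "CATATG",
    "NheI": "GCTAGC",
    "SalI": "GTCGAC",
    "XhoI": "CTCGAG",
    "BglII": "AGATCT",
    "BamHI": "GGATCC",
}


def find_sites(primer):
    """Map enzyme name to the 0-based position of its first site in the primer."""
    seq = primer.upper().replace(" ", "")
    return {
        enzyme: seq.find(site)
        for enzyme, site in RECOGNITION_SITES.items()
        if site in seq
    }


if __name__ == "__main__":
    # Reverse primer R1SalI from the text.
    print(find_sites("ATATGTCGACGTGGTGGTGGTGGTG"))
```

Only the first occurrence of each site is reported, which is sufficient for short primers carrying a single 5′ cloning site.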

Plasmids pETEc and pETGf encode exoglucanase CelE (Ec) and endoglucanase CelG (Gf) of C. cellulolyticum fused to the dockerins from C. cellulolyticum and R. flavefaciens, respectively. Plasmid pETAt, encoding a His6-tagged endoglucanase (CelA) and a dockerin from C. thermocellum (At), was obtained by PCR from pCelA with forward primer F2NdeI (5′-ATATCATATGGCAGGTGTGCCTTTTAACACAAA-3′) and reverse primer R2XhoI (5′-ATATCTCGAGCTAATAAGGTAGGTGGGG-3′). The amplified fragment was cloned into NdeI-XhoI-linearized plasmid pET24a to form pETAt. Plasmid pBglAf, encoding a His6-tagged dockerin from R. flavefaciens fused to a β-glucosidase (BglA) from C. thermocellum, was obtained by two-step cloning. First, a gene fragment coding for the His6-tagged dockerin of R. flavefaciens was obtained from pETGf by digestion with BamHI and XhoI and ligated into pET24a to form pETDf. The gene fragment of BglA was amplified by PCR from pBglA with forward primer F3NdeI (5′-ATATCATATGTCAAAGATAACTTTCCCAAAA-3′) and reverse primer R3BglII (5′-ATATAGATCTTTAAAAACCGTTGTTTTTGATTACT-3′) and inserted into NdeI-BamHI-linearized pETDf to form pBglAf. All of the scaffoldins and dockerin-tagged cellulases used in this study are summarized in Table 1.

TABLE 1 Scaffoldins and dockerin-tagged cellulases used in this study

Protein name | Description (from N terminus to C terminus) | Host cell | Tag
Scaf-c | Scaffoldin containing a cohesin from C. cellulolyticum followed by a CBD | S. cerevisiae | c-Myc
Scaf-ct | Scaffoldin containing a cohesin from C. cellulolyticum followed by a CBD followed by a second cohesin from C. thermocellum | S. cerevisiae | c-Myc
Scaf-ctf | Scaffoldin containing a cohesin from C. cellulolyticum followed by a CBD followed by a second cohesin from C. thermocellum and a third cohesin from R. flavefaciens | S. cerevisiae | c-Myc
At | Endoglucanase CelA from C. thermocellum fused with its native dockerin | E. coli | c-His₆
Ec | Exoglucanase CelE from C. cellulolyticum fused with its native dockerin | E. coli | c-His₆
Gf | Endoglucanase CelG from C. cellulolyticum fused with a dockerin from R. flavefaciens | E. coli | c-His₆
BglA | β-Glucosidase BglA from C. thermocellum fused with a dockerin from R. flavefaciens | E. coli | c-His₆

A plasmid coding for a trifunctional scaffoldin (Scaf-ctf; SEQ ID NO:1 and 2) consisting of an internal CBD flanked by three divergent cohesin domains from C. thermocellum (t), C. cellulolyticum (c), and R. flavefaciens (f) (FIG. 3) was created for surface display. To further demonstrate the specificity of the different dockerin-cohesin pairs, two smaller scaffoldins, (i) Scaf-c containing a cohesin domain from C. cellulolyticum followed by a CBD and (ii) Scaf-ct containing an additional cohesin domain from C. thermocellum at the C terminus of the CBD, were generated (FIG. 3). The different scaffoldins were displayed on the yeast cell surface by using the glycosylphosphatidyl-inositol (GPI) anchor linked at the N-terminal side of the scaffoldins. A c-Myc tag was added to the C terminus of each scaffoldin to allow detection with anti-c-Myc serum.

Display of Scaffoldins on the Yeast Cell Surface.

For the display of scaffoldins on the yeast cell surface, yeast cells harboring pScaf-c, pScaf-ct, or pScaf-ctf were precultured in SDC medium for 18 h at 30° C. These precultures were subinoculated into 200 ml SGC medium (20.0 g/liter galactose, 6.7 g/liter yeast nitrogen base without amino acids, 5.0 g/liter Casamino Acids) at an optical density (OD) at 600 nm of 0.1 and grown for 48 h at 20° C.
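As a convenience, the per-liter SGC recipe above can be scaled to the 200 ml culture volume. The following is a minimal sketch; the function name and dictionary layout are assumptions introduced for illustration, and only the concentrations and the volume come from the text.

```python
# SGC medium composition in g/liter, as stated in the text.
SGC_G_PER_LITER = {
    "galactose": 20.0,
    "yeast nitrogen base without amino acids": 6.7,
    "Casamino Acids": 5.0,
}


def grams_for_volume(recipe, volume_ml):
    """Scale a g/liter recipe to the given culture volume in milliliters."""
    liters = volume_ml / 1000.0
    return {name: round(conc * liters, 3) for name, conc in recipe.items()}


batch = grams_for_volume(SGC_G_PER_LITER, 200)
for component, grams in batch.items():
    print(f"{component}: {grams} g")
```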

Expression and Purification of Dockerin-Tagged Cellulases.

E. coli strains expressing At, Ec, and Gf were precultured overnight at 37° C. in LB medium supplemented with appropriate antibiotics. The precultures were subinoculated into 200 ml LB medium supplemented with 1.5% glycerol and appropriate antibiotics at an initial OD of 0.01 and incubated at 37° C. until the OD reached 1.5. The cultures were then cooled to 20° C., and isopropyl-β-D-thiogalactopyranoside (IPTG) was added to a final concentration of 200 μM. After 16 h, cells were harvested by centrifugation (3,000×g, 10 min) at 4° C., resuspended in buffer A (50 mM Tris-HCl [pH 8.0], 100 mM NaCl, 10 mM CaCl₂), and lysed with a sonicator. The different cellulases were purified with a His-binding resin (Novagen) at 4° C.

Minicellulosome Assembly on the Yeast Cell Surface.

To assemble the minicellulosomes, either cell lysates containing dockerin-tagged cellulases or purified cellulases were incubated with yeast cells displaying the scaffoldin for 1 h at 4° C. in buffer A. After incubation, cells were washed and harvested by centrifugation (3,000×g, 10 min) at 4° C. and resuspended in the same buffer for further use.

Immunofluorescence Microscopy.

Yeast cells displaying scaffoldins or the minicellulosomes on the surface were harvested by centrifugation, washed with phosphate-buffered saline (PBS; 8 g/liter NaCl, 0.2 g/liter KCl, 1.44 g/liter Na₂HPO₄, 0.24 g/liter KH₂PO₄), resuspended in 250 μl of PBS containing 1 mg/ml bovine serum albumin and 0.5 μg of anti-c-Myc or anti-c-His immunoglobulin G (IgG; Invitrogen), and incubated for 4 h with occasional mixing. Cells were then pelleted and washed with PBS before resuspension in PBS plus 1 mg/ml bovine serum albumin and 0.5 μg anti-mouse IgG conjugated with Alexa 488 (Molecular Probes). After incubation for 2 h, cells were pelleted, washed twice with PBS, and resuspended in PBS to an OD at 600 nm of 1. For fluorescence microscopy (Olympus BX51), 5- to 10-μl volumes of cell suspensions were spotted onto slides and a coverslip was added. Images from Alexa 488 were captured with the QCapture Pro6 software. Whole-cell fluorescence was measured with a fluorescence microplate reader (Synergy4; BioTek, VT) at an excitation wavelength of 485 nm and an emission wavelength of 535 nm.

Enzyme Assays.

Carboxymethyl cellulose (CMC) was obtained from Sigma and used as a substrate. Phosphoric acid-swollen cellulose (PASC) was prepared from Avicel PH101 (Sigma) according to the method of Walseth (27). Enzyme activity was assayed in the presence of a 0.3% (wt/vol) concentration of cellulose at 30° C. in 20 mM Tris-HCl buffer (pH 6.0). Samples were collected periodically and immediately mixed with 3 ml of DNS reagents (10 g/liter dinitrosalicylic acid, 10 g/liter sodium hydroxide, 2 g/liter phenol, 0.5 g/liter sodium sulfite). After incubation at 95° C. for 10 min, 1 ml of 40% Rochelle salts was added to fix the color before measurement of the absorbance of the supernatants at 575 nm. Glucose concentration was determined with a glucose HK assay kit from Sigma.

Fermentation.

Fermentation was conducted anaerobically at 30° C. Briefly, yeast cells were washed once with buffer containing 50 mM Tris-HCl (pH 8.0), 100 mM NaCl, and 10 mM CaCl₂ and resuspended in SDC medium containing 6.7 g/liter yeast nitrogen base without amino acids, 20 g/liter Casamino Acids, and 10 g/liter PASC as the carbon source. Reducing sugar production and glucose concentration were measured by the methods described above. The amount of residual cellulose was measured by the phenol-sulfuric acid method as described by Dubois et al. Ethanol concentration was measured with a gas chromatograph (model 6890; Hewlett Packard) with a flame ionization detector and an HP-FFAP column.

To probe the surface localization of the scaffoldins, immunofluorescent labeling of cells was carried out using anti-c-Myc sera and Alexa Fluor™ 488-conjugated goat anti-mouse IgG (Molecular Probes), and the cells were observed under a fluorescence microscope (Olympus America, Inc., San Diego, Calif.). Cells displaying the scaffoldin domains (1, 2, or 3) on the surface were brightly fluorescent (FIG. 2), while no fluorescence was observed for the control yeast cells. These results demonstrate that a synthetic scaffoldin can be successfully displayed on the surface of a heterologous host (e.g., a yeast).

To test the functionality of the displayed scaffoldins, three different cellulases, each fused with a corresponding dockerin domain (c, t, or f), were expressed in E. coli: an exoglucanase (CelE) from C. cellulolyticum fused to a dockerin domain from the same species (Ec), an endoglucanase (CelG) from C. cellulolyticum fused to a dockerin domain from R. flavefaciens (Gf), and an endoglucanase (CelA) fused to a dockerin domain from C. thermocellum (At). The plasmids used were: (i) pETEc containing an exoglucanase CelE from C. cellulolyticum fused with a dockerin domain from the same species (Ec; see, e.g., SEQ ID NO:3 and 4; Christian Gaudin et al., Journal of Bacteriology. 2000. 182:1910-1915, incorporated herein by reference); (ii) pETGf containing an endoglucanase CelG from C. cellulolyticum fused with a dockerin domain from R. flavefaciens (Gf; see, e.g., SEQ ID NO:5 and 6; Henri-Pierre Fierobe et al., The Journal of Biological Chemistry. 2005. 280:16325-16334, incorporated herein by reference); and (iii) pETAt containing an endoglucanase CelA fused with a dockerin domain from C. thermocellum (At; see, e.g., SEQ ID NO:7 and 8; Dae-Kyun Chung et al., Biotechnology Letters. 1997. 19:503-506, incorporated herein by reference). A His6 tag was added to the C terminus of each of the dockerin domains for detection of the assembly. Cells displaying scaffoldins on the surface were incubated directly with E. coli cell lysates containing At, Ec, or Gf for 1 h to form the cellulosome complex. The presence of each cellulase-dockerin pair on cells displaying Scaf-ctf was confirmed by immunofluorescence microscopy with the anti-His6 antibody (FIG. 2B).

To demonstrate the specificity of different cohesin-dockerin pairs, similar experiments were performed with cells displaying either Scaf-ct or Scaf-c. In Scaf-ct-displaying cells, fluorescence was detected only in the presence of Ec or At, whereas incubation with Gf did not result in any detectable fluorescence (FIG. 2C). Similarly, in Scaf-c-displaying cells, fluorescence was only observed in the presence of Ec (FIG. 2D). These results confirm that the specificity of the cohesins is preserved even when they are displayed on the cell surface, as only the corresponding dockerin-tagged enzymes are assembled correctly.
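The binding pattern reported above follows from a simple matching rule: an enzyme docks only if the scaffoldin carries a cohesin from the same species as the enzyme's dockerin. The sketch below encodes that rule. It is a simplified model introduced for illustration (the data structures and function name are assumptions, and it ignores binding affinity and stoichiometry), using the c, t, and f labels defined earlier.

```python
# Cohesin types carried by each displayed scaffoldin (c, t, f as defined in the text).
SCAFFOLDINS = {
    "Scaf-c": {"c"},
    "Scaf-ct": {"c", "t"},
    "Scaf-ctf": {"c", "t", "f"},
}

# Dockerin type fused to each cellulase.
DOCKERIN_OF = {"Ec": "c", "At": "t", "Gf": "f"}


def bound_enzymes(scaffoldin, enzymes):
    """Enzymes whose dockerin has a matching cohesin on the scaffoldin."""
    cohesins = SCAFFOLDINS[scaffoldin]
    return sorted(e for e in enzymes if DOCKERIN_OF[e] in cohesins)


for name in SCAFFOLDINS:
    print(name, "binds", bound_enzymes(name, ["Ec", "At", "Gf"]))
```

Under this rule Scaf-c binds only Ec, Scaf-ct binds Ec and At but not Gf, and Scaf-ctf binds all three, which matches the fluorescence results described for FIGS. 2B to 2D.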

Functionality of the Displayed Minicellulosomes.

To demonstrate the functionality of the assembled minicellulosomes, cells expressing Scaf-ctf were first saturated with different combinations of Ec, At, and/or Gf. As depicted in FIG. 3, a similar level of fluorescence was detected from the c-Myc or c-His6 tag when only one dockerin-tagged enzyme was added, indicating the correct 1:1 binding between the cohesin-dockerin pairs. A corresponding increase in fluorescence intensity was observed when an increasing number of enzymes were docked on Scaf-ctf. This result confirms that the correct 1:1 binding ratio of each dockerin-cohesin pair was preserved even when it was assembled into a three-enzyme minicellulosome on the cell surface (FIG. 3).

Engineered yeast cells docked with different combinations of cellulases were further examined for functionality in cellulose hydrolysis. Cells were resuspended in Tris buffer containing CMC, and the rate of reducing sugar production was determined. As shown in FIG. 4, cells with any one of the three cellulases docked on the surface showed visible differences in cellulose hydrolysis from the control. The endoglucanase At had the highest rate of hydrolysis, followed by Gf and Ec, a trend consistent with the relatively low activity of the exoglucanase CelE on CMC. The rate of CMC hydrolysis increased in an additive fashion when two of the cellulases were docked on the cell surface, and the highest rate of hydrolysis was observed when all three cellulases were assembled. The additive effect on CMC hydrolysis confirms that the recruitment of cellulases to the displayed scaffoldin has a minimal effect on their individual functionality.

Synergistic Effect of Displayed Minicellulosomes.

The synergistic effect on cellulose hydrolysis is an intriguing property of naturally occurring cellulosomes. To test whether the synergistic effect of the minicellulosome structure was preserved when displayed on the yeast cell surface, Avicel hydrolysis by cells displaying the minicellulosome was compared with that by purified cellulases. In this case, the amount of each cellulase docked on Scaf-ctf was first determined from the binding experiments. These predetermined amounts of cellulases were then mixed together, and the hydrolysis of Avicel with the cellulase mixture was compared with that of whole cells displaying the functional cellulosome containing the same amount of each cellulase. As shown in Table 2, the level of reducing sugar production was consistently higher for cells displaying the cellulosome, confirming that synergy was indeed maintained. The level of synergy increased from 1.62 to 2.44 when the number of cellulases recruited in the minicellulosome system increased from one to three. This result suggests the potential to further enhance cellulose hydrolysis by increasing the number of displayed cellulases.

TABLE 2 Amounts of reducing sugars released from Avicel after 24 h of incubation at 30° C. either by cells displaying cellulosomes or by the same amount of free enzymes^(a)

Cellulase pair(s) | Reducing sugars released from cellulosome (mg/liter) | Reducing sugars released from free enzymes (mg/liter) | Degree of synergy
At | 46.1 | 28.3 | 1.62
At + Ec | 80.1 | 37.6 | 2.13
At + Ec + Gf | 132.3 | 54.2 | 2.44

^(a)Reactions were conducted either with different cellulase pairs (CelE-Dc [Ec], CelA-Dt [At], or CelG-Df [Gf]) docked on the displayed Scaf-ctf or with the corresponding purified cellulases. The degree of synergy is defined as the amount of sugar released from the cellulosome over the amount of sugar released from free enzymes.
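The degree of synergy defined in the footnote to Table 2 is a simple ratio, and it can be recomputed from the tabulated sugar amounts. The following sketch is illustrative (the variable and function names are assumptions); the numbers are those of Table 2. The recomputed ratios agree with the reported values to within 0.01.

```python
# Rows of Table 2: (cellulases, cellulosome mg/liter, free enzymes mg/liter, reported synergy).
TABLE_2 = [
    ("At", 46.1, 28.3, 1.62),
    ("At + Ec", 80.1, 37.6, 2.13),
    ("At + Ec + Gf", 132.3, 54.2, 2.44),
]


def degree_of_synergy(cellulosome_mg_per_liter, free_mg_per_liter):
    """Sugar released by the cellulosome divided by sugar released by free enzymes."""
    return cellulosome_mg_per_liter / free_mg_per_liter


for label, docked, free, reported in TABLE_2:
    ratio = degree_of_synergy(docked, free)
    print(f"{label}: computed {ratio:.3f}, reported {reported}")
```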

Incorporation of β-Glucosidase into the Minicellulosome.

Since S. cerevisiae is unable to transport and utilize oligosaccharides, directing the complete hydrolysis of cellulose to glucose is essential. To achieve this goal, a β-glucosidase (BglA) from C. thermocellum tagged with the dockerin from R. flavefaciens was constructed. The resulting dockerin-tagged BglA retained the same specificity and docking efficiency as Gf (FIG. 3). FIG. 5 shows the time course of reducing sugar and glucose released from PASC with different enzyme combinations docked on the cell surface. Although 40% of the PASC was hydrolyzed in the presence of the endoglucanase At, only 25% of the reducing sugar was further hydrolyzed to glucose.

In comparison, the presence of the exoglucanase Ec not only enhanced reducing sugar production but also increased glucose production threefold. The addition of BglA further improved the rate of glucose liberation, although no difference in reducing sugar formation was observed. This result is significant because it demonstrates that a functional minicellulosome containing all three activities (exoglucanase, endoglucanase, and β-glucosidase) can be successfully assembled on the surface of a heterologous host cell. The result also confirms a role of β-glucosidase in achieving a higher conversion of cellulose to glucose. The displayed minicellulosome exhibited synergy in both reducing sugar and glucose liberation compared to that of free enzymes.

Direct Fermentation of Amorphous Cellulose to Ethanol.

The ability to ferment PASC to ethanol was examined by using the scaffoldin-displaying strains docked with different cellulases. As shown in FIG. 6, the increase in ethanol production was accompanied by a concomitant decrease in the total sugar concentration. The levels of ethanol production and PASC hydrolysis were directly correlated with the number of cellulases docked on the cell surface. The maximum ethanol production of cells displaying At, Ec, and BglA was 3.5 g/liter after 48 h; this corresponds to 95% of the theoretical ethanol yield, at 0.49 g ethanol/g sugar consumed. Moreover, the glucose concentrations during the fermentation were below the detection limit. This indicates that all of the glucose produced was quickly consumed, resulting in no detectable glucose accumulation in the medium. The level of ethanol production by cells displaying all three cellulases was higher than that of cells displaying only At and Ec, again confirming the importance of β-glucosidase in the overall cellulose-to-ethanol conversion process. More importantly, the synergistic effect of the minicellulosome was also observed, as the ethanol production by a culture with the same amounts of purified At, Ec, and BglA added to the medium was more than threefold lower.
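The yield figure above can be related to the stoichiometric maximum for glucose fermentation (one glucose gives two ethanol and two carbon dioxide). The sketch below is a back-of-the-envelope check, not part of the reported method: the molar masses are standard values rather than values from this disclosure, the variable names are assumptions, and the sugar-consumed figure is merely derived from the two reported numbers.

```python
# Standard molar masses in g/mol (not taken from the source text).
GLUCOSE_G_PER_MOL = 180.16
ETHANOL_G_PER_MOL = 46.07

# C6H12O6 -> 2 C2H5OH + 2 CO2: maximum g ethanol per g glucose.
THEORETICAL_YIELD = 2 * ETHANOL_G_PER_MOL / GLUCOSE_G_PER_MOL

# Values reported in the text.
REPORTED_YIELD = 0.49    # g ethanol per g sugar consumed
REPORTED_ETHANOL = 3.5   # g/liter after 48 h

fraction_of_theoretical = REPORTED_YIELD / THEORETICAL_YIELD
sugar_consumed = REPORTED_ETHANOL / REPORTED_YIELD  # g/liter, derived

print(f"theoretical yield: {THEORETICAL_YIELD:.3f} g/g")
print(f"fraction of theoretical: {fraction_of_theoretical:.1%}")
print(f"implied sugar consumed: {sugar_consumed:.2f} g/liter")
```

With these molar masses the theoretical yield is about 0.511 g/g, so 0.49 g/g works out to roughly 96% of the maximum, in line with the approximately 95% stated above.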

The feasibility of using secreted cellulases for the direct assembly of a functional cellulosome has also been demonstrated. The yeast secretion vector pCEL15 containing the secretion leader sequence MFα1 was used for inserting the gene coding for At. Cells carrying the secretion vector were co-cultured with cells displaying scaff#3 for 24 h. The correct assembly of the secreted At onto the cell surface was again verified by immunofluorescence microscopy. The assembled At remained active, as demonstrated by its ability to hydrolyze Avicel. By changing the inoculation densities of the two cultures, different levels of At activity, both associated with cells and remaining in the medium, were detected. These results confirm the possibility of fine-tuning the assembly of functional cellulosomes on the cell surface using an engineered consortium of cells performing separate functions.

Overall, the results demonstrate the successful functional assembly of a mini-cellulosome on the yeast surface. The displayed mini-cellulosomes enable the cells to hydrolyze cellulose and grow using cellulose as the sole carbon source. Moreover, the increased cell growth and reducing sugar production with increasing numbers of cellulases docked on the surface indicate the potential to further increase the efficiency of cellulose hydrolysis by increasing the number of displayed cellulases via the use of more complex cellulosome structures.

A number of embodiments of the invention have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. Accordingly, other embodiments are within the scope of the following claims.

Sequence of SCAF6ATGGGCGATTCTCTTAAAGTTACAGTAGGAACAGCTAATGGTAAGCCTGGCGATACAGTAACAGTTCCTGTTACATTTGCTGATGTAGCAAAGATGAAAAACGTAGGAACATGTAATTTCTATCTTGGATATGATGCAAGCCTGTTAGAGGTAGTATCAGTAGATGCAGGTCCAATAGTTAAGAATGCAGCAGTTAACTTCTCAAGCAGTGCAAGCAACGGAACAATCAGCTTCCTGTTCTTGGATAACACAATTACAGACGAATTGATAACTGCAGACGGTGTGTTTGCAAATATTAAGTTCAAATTAAAGAGTGTAACGGCTAAAACTACAACACCAGTAACATTTAAAGATGGTGGAGCTTTTGGTGACGGAACTATGTCAAAGATAGCTTCAGTTACTAAGACAAACGGTAGTGTAACGATCGATCCGACCAAGGGAGCAACACCAACAAATACAGCTACGCCGACAAAATCAGCTACGGCTACGCCCACCAGGCCATCGGTACCGACAAACACACCGACAAACACACCGGCAAATACACCGGTATCAGGCAATTTGAAGGTTGAATTCTACAACAGCAATCCTTCAGATACTACTAACTCAATCAATCCTCAGTTCAAGGTTACTAATACCGGAAGCAGTGCAATTGATTTGTCCAAACTCACATTGAGATATTATTATACAGTAGACGGACAGAAAGATCAGACCTTCTGGTGTGACCATGCTGCAATAATCGGCAGTAACGGCAGCTACAACGGAATTACTTCAAATGTAAAAGGAACATTTGTAAAAATGAGTTCCTCAACAAATAACGCAGACACCTACCTTGAAATAAGCTTTACAGGCGGAACTCTTGAACCGGGTGCACATGTTCAGATACAAGGTAGATTTGCAAAGAATGACTGGAGTAACTATACACAGTCAAATGACTACTCATTCAAGTCTGCTTCACAGTTTGTTGAATGGGATCAGGTAACAGCATACTTGAACGGTGTTCTTGTATGGGGTAAAGAACCCGGTGGCAGTGTAGTACCATCAACACAGCCTGTAACAACACCACCTGCAACAACAAAACCACCTGCAACAACAAAACCACCTGCAACAACAATACCGCCGTCAGATGATCCGAATGCAATAAAGATTAAGGTGGACACAGTAAATGCAAAACCGGGAGACACAGTAAATATACCTGTAAGATTCAGTGGTATACCATCCAAGGGAATAGCAAACTGTGACTTTGTATACAGCTATGACCCGAATGTACTTGAGATAATAGAGATAAAACCGGGAGAATTGATAGTTGACCCGAATCCTGACAAGAGCTTTGATACTGCAGTATATCCTGACAGAAAGATAATAGTATTCCTGTTTGCAGAAGACAGCGGAACAGGAGCGTATGCAATAACTAAAGACGGAGTATTTGCTACGATAGTAGCGAAAGTAAAATCCGGAGCACCTAACGGACTCAGTGTAATCAAATTTGTAGAAGTAGGCGGATTTGCGAACAATGACCTTGTAGAACAGAGGACACAGTTCTTTGACGGTGGAGTAAATGTTGGAGATATAGGATCCGCCGGTGGTTTATCCGCTGTGCAGCCTAATGTTAGTTTAGGCGAAGTACTGGATGTTTCTGCTAACAGAACCGCTGCTGACGGAACAGTTGAATGGCTTATCCCAACAGTAACTGCAGCTCCAGGCCAGACGGTCACTATGCCCGTAGTAGTCAAGAGTTCAAGTCTTGCAGTTGCTGGTGCGCAGTTCAAGATCCAGGCGGCGACAGGCGTAAGTTATTCGTCCAAGACGGACGGTGACGCTTACGGTTCAGGCATTGTGTACAATAATAGTAAGTATGCTTTTGGACAGGGTGCAGGTAGAGGAATAGTTGCAGCTGATGATTCGGTTGTGCTTACTCTTGCATATACAGTTCCCGCTGATTGTGCTGAAGGTACATATGATGTCAAGTGGTCTGATGCGTTTGTA
AGTGATACAGACGGACAGAATATCACAAGTAAGGTTACTCTTACTGATGGCGCTATCATTGTTAAGTAGSequence of EcATGCTTGTTGGGGCAGGAGATTTGATTCGAAACCATACCTTTGACAACAGAGTAGGTCTTCCATGGCACGTGGTTGAATCATACCCTGCAAAGGCAAGTTTTGAAATTACATCTGATGGTAAGTACAAGATAACTGCTCAAAAGATCGGTGAGGCAGGAAAAGGTGAAAGATGGGATATACAATTCCGTCACAGAGGACTCGCATTGCAACAAGGTCATACTTATACAGTAAAGTTTACTGTTACTGCTAGCAGAGCTTGTAAAATTTATCCTAAAATAGGTGACCAGGGTGATCCATATGATGAATACTGGAATATGAATCAACAATGGAATTTCCTGGAATTACAGGCTAATACTCCAAAAACTGTAACTCAGACATTTACACAGACTAAGGGAGATAAGAAGAACGTTGAATTTGCTTTTCACCTTGCTCCCGATAAAACTACATCTGAGGCACAGAATCCAGCAAGTTTCCAACCTATAACATATACTTTTGATGAAATTTATATTCAGGACCCTCAATTTGCAGGATATACTGAAGATCCACCTGAACCTACTAATGTTGTACGTTTGAATCAGGTAGGTTTCTATCCTAATGCTGATAAGATTGCAACAGTAGCAACAAGTTCAACAACTCCAATTAACTGGCAGTTGGTTAATAGTACTGGAGCAGCTGTTTTAACAGGTAAATCAACTGTTAAAGGTGCCGACCGTGCATCAGGTGATAATGTCCATATCATTGATTTCTCTAGTTACACAACACCTGGTACCGACTATAAGATAGTAACAGATGTATCAGTAACAAAAGCCGGAGACAATGAAAGTATGAAGTTCAATATTGGAGATGACCTTTTTACTCAAATGAAATACGATTCAATGAAGTATTTCTATCACAACAGAAGTGCTATTCCAATACAAATGCCATACTGTGATCAATCACAATGGGCACGTCCTGCAGGACACACAACTGATATACTTGCTCCAGATCCAACAAAGGATTACAAGGCTAACTACACACTTGACGTTACAGGTGGTTGGTATGATGCCGGTGACCATGGTAAGTATGTTGTTAATGGTGGTATTGCAACCTGGACCGTAATGAATGCATATGAGCGTGCACTACACATGGGTGGAGACACTTCAGTTGCTCCATTTAAAGACGGTTCTTTAGCAATACCAGAAGCGGAAGTCTATCCTGACATACTGGACGAAGCTCGTTACCAGCTCATTAACATGAAAACATTATTAAATAGTCAGGTTCCAGCAGGAAAGTATGCGGGTATGGCTCACCACAAAGCTCATGACGAACGTTGGACAGCTCTTGCTGTACGTCCCGACCAGGATACAATGAAACGTTGGTTGCAGCCTCCAAGTACAGCAGCTACATTAAATCTGGCTGCTATTGCTGCACAAAGTTCACGTCTTTGGAAACAGTTTGATTCTGCTTTCGCAACTAAGTGTTTAACTGCAGCAGAAACTGCTAGGGATGCAGCTGTAGCTCATCCAGAAATATATGCAACTATGGAACAGGGTGCCGGTGGTGGAGCATACGGAGACAACTATGTTCTTGATGATTTCTACTGGGCAGCATGTGAATTGTATGCAACTACAGGCAGTGACAAGTATTTGAACTACATAAAGAGCTCAAAGCATTATCTCGAAATGCCTACAGAATTAACAGGCGGTGAGAATACTGGAATTACAGGGGCTTTTGACTGGGGTTGTACAGCAGGTATGGGAACAATAACACTTGCACTTGTACCTACAAAGCTTCCGGCAGCAGATGTTGCTACAGCTAAAGCTAATATTCAAGCTGCAGCTGATAAGTTCATATCAATTTCAAAAGCACAAGGCTATGGTGTACCACTAGAAGAAAAAGTAATTTCATCTCCTTTTGATGCATCTGTTGTTAAAGGT
TTCCAATGGGGATCAAACTCATTCGTTATTAATGAAGCAATAGTTATGTCATATGCTTATGAATTCAGCGATGTTAATGGCACAAAGAATAATAAATATATTAATGGTGCTTTAACAGCAATGGATTACCTCCTCGGACGTAACCCAAATATTCAAAGCTATATAACTGGTTATGGTGACAACCCACTTGAAAATCCTCATCACCGTTTCTGGGCATACCAGGCAGACAACACATTCCCAAAACCACCTCCGGGATGTCTGTCAGGAGGACCTAACTCCGGCTTGCAGGATCCTTGGGTTAAGGGTTCAGGCTGGCAGCCAGGTGAAAGACCTGCTGAAAAATGCTTCATGGACAATATTGAATCTTGGTCAACAAACGAAATAACCATCAACTGGAATGCTCCTCTTGTATGGATATCAGCTTACCTTGATGAAAAGGGGCCAGAGATTGGTGGGTCAGTGACTCCTCCAACTAATTTAGGAGATGTTAACGGCGATGGAAACAAGGATGCATTGGACTTCGCTGCATTGAAGAAAGCCTTGTTAAGCCAGGATACTTCTACTATAAATGTTGCTAATGCTGATATAAACAAAGATGGTTCTATTGATGCAGTTGACTTTGCATTACTCAAATCATTCTTGTTAGGAAAAATCACACAGTGASequence of GfATGGGAACATATAACTATGGAGAAGCATTACAGAAATCAATAATGTTCTATGAATTCCAGCGTTCGGGAGATCTTCCGGCTGATAAACGTGACAACTGGAGAGACGATTCCGGTATGAAAGACGGTTCTGATGTAGGAGTTGATCTTACAGGAGGATGGTACGATGCAGGTGACCATGTGAAATTTAATCTACCTATGTCATATACATCTGCAATGCTTGCATGGTCCTTATATGAGGATAAGGATGCTTATGATAAGAGCGGTCAGACAAAATATATAATGGACGGTATAAAATGGGCTAATGATTATTTTATTAAATGTAATCCGACACCCGGTGTATATTATTACCAAGTAGGAGACGGCGGAAAGGACCACTCTTGGTGGGGCCCTGCGGAAGTAATGCAGATGGAAAGACCGTCTTTTAAGGTTGACGCTTCTAAGCCCGGTTCTGCAGTATGTGCTTCCACTGCAGCTTCTCTGGCATCTGCAGCAGTAGTCTTTAAATCCAGTGATCCTACTTATGCAGAAAAGTGCATAAGCCATGCAAAGAACCTGTTTGATATGGCTGACAAAGCAAAGAGTGATGCTGGTTATACTGCGGCTTCAGGCTACTACAGCTCAAGCTCATTTTACGATGATCTCTCATGGGCTGCAGTATGGTTATATCTTGCTACAAATGACAGTACATATTTAGACAAAGCAGAATCCTATGTACCGAATTGGGGTAAAGAACAGCAGACAGATATTATCGCCTACAAGTGGGGACAGTGCTGGGATGATGTTCATTATGGTGCTGAGCTTCTTCTTGCAAAGCTTACAAACAAACAATTGTATAAGGATAGTATAGAAATGAACCTTGACTTCTGGACAACTGGTGTTAACGGAACACGTGTTTCTTACACGCCAAAGGGTTTGGCGTGGCTATTCCAATGGGGTTCATTAAGACATGCTACAACTCAGGCTTTTTTAGCCGGTGTTTATGCAGAGTGGGAAGGCTGTACGCCATCCAAAGTATCTGTATATAAGGATTTCCTCAAGAGTCAAATTGATTATGCACTTGGCAGTACCGGAAGAAGTTTTGTTGTCGGATATGGAGTAAATCCTCCTCAACATCCTCATCACAGAACTGCTCACGGTTCATGGACAGATCAAATGACTTCACCAACATACCACAGGCATACTATTTATGGTGCGTTGGTAGGAGGACCGGATAATGCAGATGGCTATACTGATGAAATAAACAATTATGTCAATAATGAAATAGCCTGCGATTATAATGCCGGATTTACAGGTGCACTTGCAAAAATGTACAAGCATTCTGGC
GGAGATCCGATTCCAAACTTCAAGGCTATCGAAAAAATAACCAACGATGAAGTTATTATAAAGGCAGGTTTGAATTCAACTGGCCCTAACTACACTGAAATCAAGGCTGTTGTTTATAACCAGACAGGATGGCCTGCAAGAGTTACCGACAAGATATCATTTAAATATTTTATGGACTTGTCTGAAATTGTACCAGCAGGAATTGATCCTTTAAGCCTTGTAACAACTTCAAATTATTCTGAAGGTAAGAATACTAAGGTTTCCGGTGTGTTGCCATGGGATGTTTCAAATAATGTTTACTATGTAAATGTTGATTTGACAGGAGAAAATATCTACCCAGGCGGTCAGTCTGCGTGCAGACGAGAACTTCACTTCAGAATTGCCGCACCACAGGGAAGAAGATATTGGAATCCGAAAAATGATTTCTCATATGATGGATTACCAACCACCAGTACTGTAAATACCGTTACCAACATACCTGTTTATGATAACCGCGTAAAAGTATTTGGTAACGAACCCGCAGGTGGATCAGAACCCGCCACAAAGCTCGTTCCTACATGGGGCGATACAAACTGCGACCGCGTTGTAAATGTTGCTGACGTAGTACTTCTTAACAGATTCCTCAACGATCCTACATATTCTAACATTACTGATCAGGGTAAGGTTAACGCAGACCTTGTTGATCCTCAGGATAAGTCCGGCGCACCAGTTGATCCTGCAGGCGTAAAGCTCACAGTAGCTGACTCTGAGGCAATCCTCAAGGCTATCGTTGAACTCATCACACTTCCTCAATGASequence of AtATGGCAGGTGTGCCTTTTAACACAAAATACCCCTATGGTCCTACTTCTATTGCCGATAATCAGTCGGAAGTAACTGCAATGCTCAAACCAGAATGGGAAGACTGGAAGACCAAGAGAATTACCTCGAACCGTGCAGGAGGATACAAGAGAGTACACCGTGATGCTTCCACCAATTATGATACCGTATCCGAAGGTATGGGATACCGACTTCTTTTGGCGGTTTGCTTTAACGAACAGGCTTTGTTTGACGATTTATACCGTTACGTAAAATCTCATTTCAATGGAAACCGACTTATGCACTGGCACATTGATGCCAACAACAATGTTACAAGTCATGACCGCGGCGACCGTGCGGCAACCGATGCTGATGAGGATATTGCACTTGCGCTCATATTTGCGGACAAGTTATGGGGTTCTTCCGCTGCAATAAACTACCGCCAGGAACCAAGGACATTGATAAACAATCTTTACAACCATTGTGTAGACCATGGATCCTATGTATTAAAGCCCGGTGACAGATGGGGAGGTTCATCAGTAACAAACCCGTCATATTTTGCGCCTGCATGGTACAAAGTGTATGCTCAATATACAGGAGACACAAGATGGAATCAAGTGGCGGACAAGTGTTACCAAATTGTTGAAGAAGTTAAGAAATACAACAACCGAACCGGCCTTGTTCCTGACTGGTGTACTGCAACCGGAACTCCGCCAACCGGTCAGAGTTACGACTACAAATATGATGCTACACGTTACCGCTGGAGAACTGCCGTGGACTATTCATGGTTTGGTGACCAGAGACCAAAGGCAAACTGCGATATGCTGACCAAATTCTTTGCCAGAGACCGGGCAAAAGGAATCGTTGACCGATACACAATTCAAGGTTCAAAAATTACCAACAATCACAACGCATCAITTATAGGACCTGTTGCGCCACCAAGTATGACAGGTTACGATTTGAACTTTGCAAAGGAACTTTATAGGGAGACTGTTGCTGTAAAGGACAGTGAATATTACCGATATTACCGAAACACCTTGAGACTGCTCACTTTGTTGTACATAACAGGAAACTTCCCGAATCCTTTGAGTGACCTTTCCGGCCAACCGACACCACCGTCGAATCCGACACCTTCATTGCCTCCTCAGGTTGTTTACCGTGATGTAAATGGCGACCGTAATGTTAACTCCACTGAT
TTGACTATGTTAAAAAGATATCTGCTGAAGAGTGTTACCAATATAAACAGAGAGGCTGCAGACGTTAATCGTGACCGTGCGATTAACTCCTCTGACATGACTATATTAAAGAGATATCTGATAAAGACCATACCCCACCTACCTTATTAG

What is claimed is:
 1. A recombinant yeast cell comprising: a heterologous polynucleotide that encodes a secreted polypeptide comprising a dockerin domain, a scaffoldin polypeptide, one or more cellulose binding domains (CBD) and a plurality of divergent-heterologous cohesin domains.
 2. The recombinant yeast cell of claim 1, wherein the polypeptide comprises from N- to C-terminus: the dockerin polypeptide, the plurality of divergent-heterologous cohesin domains and the CBD.
 3. The recombinant yeast cell of claim 1, wherein the dockerin domain is from B. cellulosolvens or C. acetobutylicum.
 4. A recombinant yeast cell comprising: a heterologous polynucleotide encoding a polypeptide comprising a surface anchor antigen (AGA) domain, a scaffoldin domain and at least one cohesin domain.
 5. The recombinant yeast cell of claim 4, wherein the AGA domain is AGA1 or AGA2.